polars: GroupBy.agg()
from Polars Expressions
polars.DataFrame.group_by()
https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.dataframe.group_by.GroupBy.agg.html
https://docs.pola.rs/user-guide/expressions/aggregation/
code:py
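 from datetime import date
 import polars as pl

 # `dataset` is assumed to be an existing DataFrame with
 # "state", "gender" and "birthday" columns.
 def compute_age() -> pl.Expr:
     return date.today().year - pl.col("birthday").dt.year()

 def avg_birthday(gender: str) -> pl.Expr:
     # Mean age of the rows in each group that match `gender`
     return (
         compute_age()
         .filter(pl.col("gender") == gender)
         .mean()
         .alias(f"avg {gender} birthday")
     )

 q = (
     dataset.lazy()
     .group_by("state")
     .agg(
         avg_birthday("M"),
         avg_birthday("F"),
         # Summing a boolean expression counts the True rows
         (pl.col("gender") == "M").sum().alias("# male"),
         (pl.col("gender") == "F").sum().alias("# female"),
     )
     .limit(5)
 )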
Referencing a column directly inside agg, without an aggregation function, produces a polars List-type column
code:py
 import polars as pl

 df = pl.DataFrame(
     {
         "group": ["a", "b", "a", "b"],
         "values": [1, 2, 3, 4],
     }
 )
 # No aggregation function: each group's values are collected into a list
 grouped_df = df.group_by("group").agg(pl.col("values"))
 grouped_df
code:result
 shape: (2, 2)
 ┌───────┬───────────┐
 │ group ┆ values    │
 │ ---   ┆ ---       │
 │ str   ┆ list[i64] │
 ╞═══════╪═══════════╡
 │ a     ┆ [1, 3]    │
 │ b     ┆ [2, 4]    │
 └───────┴───────────┘